CLAIMS

What is claimed is: 

1. A system for managing audio information, comprising:

a fingerprinting component that maps portions of a plurality of audio files to 
corresponding fingerprints; and 

a detector that tags one or more of the audio files for potential removal from a 
data storage device based in part upon a distance between the fingerprints. 

2. The system of claim 1, the detector tags the audio files based upon the distance 
between fingerprints being below a predetermined threshold. 

3. The system of claim 1, the fingerprinting component further producing a plurality
of fingerprints for a file, the plurality of fingerprints corresponding to a time window of 
audio in the file, and wherein the detector tags the audio files based upon a lowest 
distance between the plurality of fingerprints and one or more stored fingerprints for each 
file. 

4. The system of claim 3, the fingerprinting component is disposed to accept a time 
offset into the audio file and a duration of a time window in the files. 

5. The system of claim 1, the fingerprinting component computes fingerprints that are
generated from more than one second of audio, and that consist of about 64 floating point 
numbers. 

6. The system of claim 1, the detector utilizes at least two internal databases referred
to as DB1 and DB2, wherein in DB1 a record comprises a fingerprint and associated
numerical quantities including a normalization factor, and in DB2 a record includes at
least four objects: a filename, an associated index referred to as an ID index, an
‘offset’ parameter and a ‘distance’ parameter.

7. The system of claim 4, the detector computes and compares all fingerprints in a 
time window, in order to find a best matching location in a file that has already been 
processed. 

8. The system of claim 7, the detector is employed for determining an identity of an 
audio file. 

9. The system of claim 8, the identity is composed of metadata associated with an 
audio file. 

10. The system of claim 1, further comprising a database that is employed to output a 
list of duplicate or defective audio files to a user interface. 

11. The system of claim 10, the detector logs error conditions while processing the 
audio files and outputs a list of files associated with the error conditions to the user 
interface. 

12. The system of claim 1, further comprising a database for storing veto fingerprints 
that are employed to identify noisy audio files. 

13. A computer readable medium having computer readable instructions stored 
thereon for implementing the fingerprinting component and the detector of claim 1.

14. A user interface for managing audio files, comprising:

a display component providing one or more options for potential audio files to 
remove from a database; and 

an input component to at least one of: select the options, and configure an 
automated audio pruning component that determines the potential audio files for removal. 

15. The user interface of claim 14, further comprising a component for organizing 
files for possible deletion and a display to allow a user to select which of the files to 
delete. 

16. The user interface of claim 15, further comprising a component to at least one of 
offer the user the ability to save one or more of duplicate files based on quality 
comparisons between files, apply preferential treatment to files based on an encoding 
associated with the files, and apply preferential treatment based on digital rights 
management associated with the files. 

17. The user interface of claim 16, further comprising a component that presents the 
user with various levels of warning, based on how confident a duplicate detector is that 
the files are in fact duplicates. 

18. The user interface of claim 16, further comprising a ‘fast browse’ component to 
compare duplicate files. 

19. The user interface of claim 16, further comprising an option to request that a 
duplicate detector be run on files using many fingerprints at different locations, to 
determine which parts of the files are duplicated. 

20. A method to facilitate audio file management, comprising:
accepting time parameters to search an audio file; 
determining fingerprints for the audio file; and 

employing the time parameters and the fingerprints to determine audio files to 
potentially remove from a database. 

21. The method of claim 20, further comprising recursively processing audio files in a
directory tree and determining a normalization factor for each of the fingerprints. 

22. The method of claim 20, further comprising creating a set of traces for audio files, 
and checking the traces against a set of fingerprints created for other audio files. 

23. The method of claim 20, further comprising concurrently creating fingerprints and 
checking for duplicates in a single pass through the audio file data.

24. The method of claim 20, further comprising employing one or more veto 
fingerprints to determine a noisy file. 

25. The method of claim 20, further comprising processing the audio file in at least 
two layers, where the output of a first layer depends on a log spectrum computed over a 
small window and a second layer operates on a vector computed by aggregating vectors 
produced by the first layer. 

26. The method of claim 25, further comprising providing a wider temporal window 
in a subsequent layer than a preceding layer.

27. The method of claim 25, further comprising employing at least one of the layers 
to compensate for time misalignment between files. 
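
For illustration only, the distance-based tagging recited in claims 1 to 3 might be
sketched as follows. This is a minimal sketch under stated assumptions, not the claimed
implementation: the Euclidean distance, the function names (distance, lowest_distance,
tag_duplicates), and the in-memory dictionary standing in for stored fingerprints are all
hypothetical choices, since the claims do not fix a distance measure or a storage layout.

```python
import math
from typing import Dict, List, Sequence

Fingerprint = Sequence[float]  # e.g. about 64 floating point numbers (claim 5)


def distance(a: Fingerprint, b: Fingerprint) -> float:
    """Euclidean distance between two fingerprints (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lowest_distance(window: List[Fingerprint], stored: List[Fingerprint]) -> float:
    """Lowest distance between any fingerprint in a time window of one file
    and any stored fingerprint of another file (compare claim 3)."""
    return min(distance(w, s) for w in window for s in stored)


def tag_duplicates(files: Dict[str, List[Fingerprint]], threshold: float) -> List[str]:
    """Tag files whose lowest distance to an already processed file falls
    below a predetermined threshold (compare claim 2)."""
    stored: Dict[str, List[Fingerprint]] = {}  # hypothetical fingerprint store
    tagged: List[str] = []
    for name, window in files.items():
        if any(lowest_distance(window, fps) < threshold for fps in stored.values()):
            tagged.append(name)      # potential removal; nothing is deleted here
        else:
            stored[name] = window    # first occurrence is kept as the reference
    return tagged


if __name__ == "__main__":
    files = {
        "a.wav": [[0.0, 0.0], [1.0, 1.0]],
        "b.wav": [[5.0, 5.0], [1.0, 1.05]],
        "c.wav": [[9.0, 9.0]],
    }
    print(tag_duplicates(files, threshold=0.1))
```

In this sketch a file is only tagged, never removed, which mirrors the claim language
"potential removal"; the actual deletion decision is left to a user or a later component.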